A General, Sound and Efficient Natural Language Parsing Algorithm based on Syntactic Constraints Propagation

Author

  • Jose F. Quesada
Abstract

This paper presents a new context-free parsing algorithm based on a bidirectional, strictly horizontal strategy which incorporates strong top-down predictions (derivations and adjacencies). From a functional point of view, the parser is able to propagate syntactic constraints, reducing parsing ambiguity. From a computational perspective, the algorithm includes different techniques aimed at improving the manipulation and representation of the structures used.

1 Parsing Ambiguity and Parsing Efficiency

In Formal Language Theory [Aho & Ullman 1972, Drobot 1989] a language is a set, and in Set Theory an element either belongs to a set or does not. That is to say, a set (and therefore a language) is an unambiguous structure. A grammar may be considered an intensional definition of a language. Thus, the notion of grammaticality corresponds to the relation of membership over a language (set). But a grammar incorporates more information than the simple enumeration of the elements of the language (the extensional specification). A grammar defines a structure: the parse tree or forest. The distance between grammaticality and grammatical structure is a first level of ambiguity: grammatical ambiguity.

The next notion to take into account is the process of analysis of a string of words with a grammar, that is, the parser [Kay 1980, Bolc 1987, Sikkel & Nijholt 1997]. A parser must be able to determine the relation of grammaticality and to obtain the grammatical structure by means of a set of operations, which we will call the parsing structure. The distance between the grammatical structure and the parsing structure defines a second level of ambiguity: parsing ambiguity, usually referred to as temporal ambiguity. Parsing ambiguity depends on two factors: the grammar and the parsing strategy. A very important design requirement of natural language parsers is to eliminate parsing ...

∗ Jose F. Quesada: A General, Sound and Efficient Natural Language Parsing Algorithm based on Syntactic Constraints Propagation. Proceedings of CAEPIA'97, Málaga, Spain. 775–786
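
The distinction drawn above between grammaticality (set membership) and grammatical structure (the parse forest) can be made concrete with a small sketch. The following Python fragment is not the algorithm of the paper; it is a plain CYK-style tree counter over a hypothetical toy grammar in Chomsky normal form (`S -> S S | 'a'`), chosen only because it is ambiguous. The names `BINARY_RULES`, `LEXICAL_RULES`, `count_parses` and `is_grammatical` are assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper):
#   S -> S S | 'a'
BINARY_RULES = [("S", "S", "S")]   # (parent, left child, right child)
LEXICAL_RULES = [("S", "a")]       # (parent, terminal)


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` with a CYK-style chart."""
    n = len(words)
    # chart[(i, j)][A] = number of trees rooted in A that span words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for parent, terminal in LEXICAL_RULES:
            if terminal == word:
                chart[(i, i + 1)][parent] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


def is_grammatical(words):
    """Grammaticality as set membership: at least one parse tree exists."""
    return count_parses(words) > 0


if __name__ == "__main__":
    sentence = ["a", "a", "a"]
    print(is_grammatical(sentence))  # membership is a yes/no answer
    print(count_parses(sentence))    # the structure may still be ambiguous
```

Under this toy grammar the string `a a a` belongs to the language, yet it has two parse trees, which is the first level of ambiguity (grammatical ambiguity) described above. The intermediate chart entries that never contribute to a complete tree give a rough picture of the second level, parsing ambiguity.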

Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing of natural language that automatically analyzes the dependency structure of sentences, creating a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

Coordinated Morphological and Syntactic Analysis of Japanese Language

A method for parallel morphological and syntactic analysis of Japanese language is proposed. Parallel syntactic analysis is based on an efficient parallel LR parsing algorithm for general context-free grammars. It handles syntactic features as constraints. Each syntactic feature is defined by a verbal sub-categorization and attached to a special set of phrases called bunsetsu in Japanese. The b...

L* Parsing: A General Framework for Syntactic Analysis of Natural Language

We describe a new algorithm for table-driven parsing with context-free grammars designed to support efficient syntactic analysis of natural language. The algorithm provides a general framework in which a variety of parser control strategies can be freely specified: bottom-up strategies, top-down strategies, and strategies that strike a balance between the two. The framework permits better shari...

Fence - An Efficient Parser with Ambiguity Support for Model-Driven Language Specification

Model-based language specification has applications in the implementation of language processors, the design of domain-specific languages, model-driven software development, data integration, text mining, natural language processing, and corpus-based induction of models. Model-based language specification decouples language design from language processing and, unlike traditional grammar-driven ...

Journal title:
  • CoRR

Volume cmp-lg/9801005

Publication date: 1998